Digitizing a Million Books: Challenges for Document Analysis
Authors
Abstract
This paper describes the challenges facing the document image analysis community in building large digital libraries with diverse document categories. The challenges are identified from the experience of the ongoing activities toward digitizing and archiving one million books. A smooth workflow has been established for archiving large quantities of books, with the help of efficient image processing algorithms. However, much more research is needed to address the challenges arising from the diversity of the content in digital libraries.
Similar resources
A Mass Digitization Primer
Many people are talking these days about “digitizing books.” But what does that really mean? This paper describes different kinds of digitizing, the pros and cons of each, and suggests a layered structure for understanding “digitization.” The Digital Age has brought many challenges for librarians. Most obvious are all the issues concerning “born-digital” material that academics and the general ...
A policy framework for the challenges of implementing regional higher education management in Iran
The models of regional governance in the world, particularly for administration of higher education are considered vital. In Iran, with the approval of Iran's Higher Education System Spatial Management Document, the issue of regional management in higher education was given special attention. Articles 1 and 2 of the document specifically address the regional higher education structure of the ...
An Analysis of Ministry of Education’s Strategic Plans Based on Favorable Components of English Language Teaching Using Shannon’s Entropy
The present research aims to analyze the content of Ministry of Education’s strategic plans (the Fundamental Reform Document of Education, the Comprehensive National Scientific Plan and the National Curriculum Document) based on Shannon's entropy regarding the favorable components of teaching English. The contents of the Fundamental Reform Document of Education, the Comprehensive National Scien...
The Personalized Services in CADAL Digital Library
CADAL is a large digital library project that is digitizing one million books and publishing them to internet users. Users clearly face an information overload problem when visiting the CADAL portal. Therefore, we have been concerned with providing useful and flexible personalization services to reduce the users’ time and energy cost of finding interesting information...
Slicing Books – The Authors' Perspective
While authors are still struggling to understand how to make best use of the potential offered by hypertext documents, computer science research proceeds to develop the next generation of the digital documents. This next generation will be based on richer semantics, more potential for automation and personalization and will pose new challenges to the authors. This paper aims to give a first imp...